Despite its nonconvex nature, $\ell_0$ sparse approximation is desirable in many theoretical and application cases. We study the $\ell_0$ sparse approximation problem with the tool of deep learning, by proposing Deep $\ell_0$ Encoders. Two typical forms, the $\ell_0$ regularized problem and the $M$-sparse problem, are investigated. Based on solid iterative algorithms, we model them as feed-forward neural networks, through introducing novel neurons and pooling functions. Enforcing such structural priors acts as an effective network regularization. The deep encoders also enjoy faster inference, larger learning capacity, and better scalability compared to conventional sparse coding solutions. Furthermore, under task-driven losses, the models can be conveniently optimized from end to end. Numerical results demonstrate the impressive performances of the proposed encoders.